LINDA+ is a method for identifying intra- and inter-cellular interaction networks at domain resolution, which makes it possible to study the effects of alternative splicing (AS) on signalling. The tool integrates prior knowledge networks (PKN) of directed joint interactions (intra-cellular protein-protein and domain-domain interactions from DIGGER (Louadi et al. 2021)) and structurally resolved ligand-receptor pairs from various resources (list of resources when ready) with transcription factor activities estimated from sc-RNAseq reads using DoRothEA (Garcia-Alonso et al. 2019) and, optionally, with abundances of secreted extra-cellular proteins from secretomics data analysis. The large-scale prior knowledge of joint interactions is contextualised with the sequencing and MS/MS data through an Integer Linear Programming (ILP) formulation, in order to identify a subset of functional interactions between proteins and their domains. The LINDA+ pipeline is depicted in Figure 1.
Figure 1: LINDA+ pipeline
1. From the DoRothEA (Garcia-Alonso et al., 2018) resource, normalized transcription factor (TF) enrichment scores (NESs) can be estimated with viper (Shen et al. 2014; Alvarez et al. 2016). Based on their activities, the most regulated TF’s can then be used as the bottom layer of signalling from where we reverse-engineer the upstream regulatory interactions.
2. Abundances of ligands obtained from LC-MS secretomics. TBD: How to use secretomics data as inputs.
3. A joint network graph from DIGGER (Louadi et al. 2021) that integrates PPIs and DDIs and in which nodes represent protein domains defined by concatenating Entrez and Pfam id. The resource contains database tables for both genomic data, e.g. genes with their corresponding transcript and exon coordinates, and for proteins, e.g. isoforms and their domains. The protein coordinates were converted to genomic coordinates in the coding sequence and both tables are merged to be able to map transcripts with their corresponding exons to the corresponding protein isoforms and Pfam domains.
4. Structurally resolved ligand-receptor pairs from various resources. TBD: What resources to use
The examples below help users understand how the LINDA+ R-package works and how its inputs should be formatted.
The steps below run a small toy case study, depicted in Figure 2. The toy system consists of 3 different cell-types (CellA, CellB and CellC), where each node represents a protein (domains not depicted) and edges represent interactions between nodes. Yellow triangles represent cell receptors, red squares represent intra-cellular proteins, green triangles represent TF’s, and blue circles represent ligands in the extra-cellular space. The interactions between nodes are of the kind obtained from large-scale resources such as DIGGER for the intra-cellular PPI’s and DDI’s (Louadi et al. 2021); DoRothEA (Garcia-Alonso et al., 2018) for the TF-to-ligand interactions, which connect the intra-cellular space with the extra-cellular space; and resources such as LIANA+ (Dimitrov et al. 2023) or CellPhoneDB (Efremova et al. 2020), which we use as prior knowledge for the ligand-receptor interactions.
Figure 2: LINDA+ Toy Example
Please note that, for simplicity, Figure 2 depicts each protein as a single node in the network; LINDA+, however, models interactions between proteins at domain resolution.
R-packages needed for the analysis:
library(LINDAPlus)
The background network list consists of two elements:
background.networks: A named list (one element per cell-type) of data-frames containing the joint PPI-DDI’s for each cell-type. Each data-frame represents the interaction knowledge for one cell-type and should contain at least 4 columns with the following ID’s: ‘pfam_source’, ‘pfam_target’, ‘gene_source’ and ‘gene_target’. Where a domain has no name, or where this is not applicable, please set the corresponding values in ‘pfam_source’ or ‘pfam_target’ to NA.
ligands.receptors: A named list (‘ligands’ and ‘receptors’) of character vectors defining which elements are ligands and which are receptors.
load(file = system.file("extdata", "toy.background.networks.list.RData", package = "LINDAPlus"))
print(background.networks.list)
## $background.networks
## $background.networks$CellA
## pfam_source pfam_target gene_source gene_target
## 2 D2 D14 A1 A4
## 3 D4 D18 A2 A5
## 4 D6 D15 A2 A5
## 5 D10 D21 A3 A6
## 6 D9 D19 A3 A6
## 7 D8 D20 A3 A6
## 9 D12 D47 A4 A7
## 10 D13 D46 A4 A7
## 11 D14 D25 A4 A8
## 12 D13 D23 A4 A8
## 13 D16 D24 A5 A8
## 14 D20 D26 A6 A9
## 15 D20 D32 A6 A10
## 16 D19 D31 A6 A10
## 17 D22 D30 A6 A10
## 18 D23 D36 A8 A11
## 19 D23 D38 A8 A12
## 20 D24 D37 A8 A12
## 21 D28 D40 A9 A12
## 22 D32 D43 A10 A13
## 23 D31 D41 A10 A13
## 24 D30 D42 A10 A13
## 25 D34 D48 A11 A14
## 26 D35 D52 A11 A15
## 27 D34 D54 A11 A15
## 28 D36 D53 A11 A15
## 29 D34 D55 A11 A17
## 30 D38 D51 A12 A14
## 31 D39 D48 A12 A14
## 32 D40 D50 A12 A14
## 33 D37 D54 A12 A15
## 34 D39 D59 A12 A16
## 35 D44 D61 A13 A16
## 36 D41 D60 A13 A16
## 37 D41 D57 A13 A17
## 38 D42 D55 A13 A17
## 39 D43 D56 A13 A17
## 1 <NA> <NA> A14 L1
## 110 <NA> D1 L2 A1|A2
## 210 <NA> D3 L2 A1|A2
## 310 <NA> D5 L2 A1|A2
## 41 <NA> D7 L4 A3
##
## $background.networks$CellB
## pfam_source pfam_target gene_source gene_target
## 2 D3 D7 B1 B3
## 3 D2 D8 B1 B3
## 4 D2 D12 B1 B4
## 6 D5 D10 B2 B4
## 7 D6 D13 B2 B4
## 8 D4 D12 B2 B4
## 9 D5 D17 B2 B5
## 10 D6 D15 B2 B5
## 11 D8 D20 B3 B6
## 12 D9 D18 B3 B6
## 13 D7 D19 B3 B6
## 14 D8 D23 B3 B7
## 15 D7 D21 B3 B7
## 16 D9 D22 B3 B7
## 17 D12 D26 B4 B8
## 18 D13 D25 B4 B8
## 19 D11 D27 B4 B8
## 20 D16 D28 B5 B9
## 21 D15 D30 B5 B9
## 22 D14 D29 B5 B9
## 23 D20 D33 B6 B10
## 24 D19 D31 B6 B10
## 25 D23 D37 B7 B11
## 26 D22 D35 B7 B11
## 27 D26 D37 B8 B11
## 28 D25 D38 B8 B11
## 29 D24 D36 B8 B11
## 30 D25 D39 B8 B12
## 31 D30 D35 B9 B11
## 32 D31 D43 B10 B13
## 33 D34 D45 B10 B13
## 34 D32 D48 B10 B14
## 35 D31 D50 B10 B14
## 36 D34 D47 B10 B14
## 37 D37 D50 B11 B14
## 38 D38 D49 B11 B14
## 39 D37 D53 B11 B15
## 40 D38 D51 B11 B15
## 41 D35 D54 B11 B15
## 42 D40 D52 B12 B15
## 43 D41 D54 B12 B15
## 44 D39 D51 B12 B15
## 45 D39 D56 B12 B16
## 46 D40 D55 B12 B16
## 1 <NA> <NA> B13 L1
## 210 <NA> <NA> B16 L2
## 110 <NA> D1 L3 B1
## 211 <NA> D4 L4 B2
##
## $background.networks$CellC
## pfam_source pfam_target gene_source gene_target
## 2 D1 D8 C1 C3
## 3 D2 D10 C1 C4
## 4 D1 D12 C1 C4
## 5 D3 D14 C2 C5
## 6 D6 D16 C2 C5
## 7 D4 D15 C2 C5
## 8 D7 D18 C3 C6
## 9 D8 D17 C3 C6
## 10 D8 D22 C3 C7
## 11 D7 D24 C3 C7
## 12 D9 D21 C3 C7
## 13 D11 D21 C4 C7
## 14 D13 D23 C4 C7
## 15 D15 D28 C5 C8
## 16 D16 D25 C5 C8
## 17 D14 D26 C5 C8
## 18 D14 D31 C5 C9
## 19 D16 D30 C5 C9
## 20 D17 D36 C6 C11
## 21 D18 D35 C6 C11
## 22 D22 D34 C7 C10
## 23 D23 D32 C7 C10
## 24 D22 D36 C7 C11
## 25 D24 D35 C7 C11
## 26 D27 D35 C8 C11
## 27 D29 D40 C9 C12
## 28 D31 D38 C9 C12
## 29 D34 D41 C10 C13
## 30 D33 D42 C10 C13
## 31 D32 D44 C10 C13
## 32 D32 D47 C10 C14
## 33 D34 D48 C10 C14
## 34 D36 D40 C11 C12
## 35 D35 D38 C11 C12
## 36 D37 D49 C11 C15
## 37 D36 D50 C11 C15
## 38 D36 D55 C11 C16
## 39 D39 D50 C12 C15
## 40 D40 D49 C12 C15
## 41 D38 D52 C12 C15
## 42 D39 D53 C12 C16
## 43 D40 D54 C12 C16
## 44 D38 D55 C12 C16
## 45 D43 D57 C13 L4
## 46 D44 D56 C13 L4
## 47 D41 D58 C13 L4
## 1 <NA> <NA> C13 L3
## 210 <NA> <NA> C13 L4
## 310 <NA> <NA> <NA> <NA>
## 110 <NA> D3 L1 C1
## 211 <NA> D4 L1 C2
##
##
## $ligands.receptors
## $ligands.receptors$ligands
## [1] "L1" "L2" "L3" "L4"
##
## $ligands.receptors$receptors
## [1] "A1|A2" "A3" "B1" "B2" "C1" "C2"
Please note that receptors consisting of multi-subunit protein complexes (i.e. 2 or more protein units) should be written with their subunits separated by the ‘|’ symbol (e.g. the receptor complex ‘A1|A2’, which consists of proteins A1 and A2).
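As a minimal sketch of the input format described above, a background network list can also be assembled by hand in base R. The two-row network and the object names `cell.a`, `my.networks.list` and `required.cols` are hypothetical and only illustrate the column layout; the column check is a convenience for the user, not a function of the LINDA+ package.

```r
# Hypothetical two-row network for a single cell-type, following the
# column layout described above (not taken from the toy data).
cell.a <- data.frame(
  pfam_source = c(NA, "D2"),
  pfam_target = c("D1", "D14"),
  gene_source = c("L2", "A1"),
  gene_target = c("A1|A2", "A4"),
  stringsAsFactors = FALSE
)

my.networks.list <- list(
  background.networks = list(CellA = cell.a),
  ligands.receptors = list(ligands = "L2", receptors = "A1|A2")
)

# Check that every cell-type data-frame has the four required columns.
required.cols <- c("pfam_source", "pfam_target", "gene_source", "gene_target")
has.cols <- vapply(my.networks.list$background.networks,
                   function(df) all(required.cols %in% colnames(df)),
                   logical(1))
print(has.cols)

# Split receptor complexes into their subunits; fixed = TRUE treats '|'
# as a literal character rather than a regular-expression operator.
subunits <- strsplit(my.networks.list$ligands.receptors$receptors,
                     split = "|", fixed = TRUE)
names(subunits) <- my.networks.list$ligands.receptors$receptors
print(subunits)
```

Passing `fixed = TRUE` matters here because ‘|’ is the alternation operator in regular expressions and would otherwise not be matched literally.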
TF activities should be provided as a named list (one element per cell-type) of data-frames holding the enrichment score of each TF in each cell-type. The data-frames should contain at least two columns: ‘tf’ (the TF ID) and the numerical ‘score’ (the enrichment score of each TF).
load(file = system.file("extdata", "toy.tf.scores.RData", package = "LINDAPlus"))
print(tf.scores)
## $CellA
## tf score
## 1 A14 1
## 2 A15 0
## 3 A16 0
## 4 A17 1
##
## $CellB
## tf score
## 1 B13 1
## 2 B14 1
## 3 B15 0
## 4 B16 1
##
## $CellC
## tf score
## 1 C13 1
## 2 C14 0
## 3 C15 1
## 4 C16 0
Additionally, users can provide a named (also by cell-type) numerical vector indicating the number of TF’s to consider as significantly regulated based on their absolute enrichment values. If this parameter is not defined, all TF’s provided in the list of data-frames are considered significantly regulated by default.
load(file = system.file("extdata", "toy.top.tf.RData", package = "LINDAPlus"))
print(top.tf)
## CellA CellB CellC
## 2 3 2
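To make the role of the two TF inputs concrete, the sketch below ranks TF’s by absolute score and keeps the requested number per cell-type. The scores are made up, the names `tf.scores.example`, `top.tf.example` and `select_top_tfs` are hypothetical, and the ranking rule is an assumption based on the description above; the selection that LINDA+ performs internally may differ (for example in how ties are handled).

```r
# Made-up enrichment scores for one cell-type, for illustration only.
tf.scores.example <- list(
  CellA = data.frame(tf = c("A14", "A15", "A16", "A17"),
                     score = c(2.1, -0.3, 0.4, -1.8),
                     stringsAsFactors = FALSE)
)
top.tf.example <- c(CellA = 2)

# Assumed selection rule: rank by absolute score and keep the first n.
select_top_tfs <- function(scores, n) {
  ranked <- scores[order(abs(scores$score), decreasing = TRUE), ]
  head(ranked$tf, n)
}

selected <- lapply(names(tf.scores.example), function(ct) {
  select_top_tfs(tf.scores.example[[ct]], top.tf.example[[ct]])
})
names(selected) <- names(tf.scores.example)
print(selected)
```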
In principle, users do not necessarily need to provide prior information about regulated or enriched receptors or ligand-receptor interactions in order to run LINDA+. LINDA+ can infer such potentially regulated mechanisms from the provided TF activities alone:
res <- runLINDAPlus(background.networks.list = background.networks.list,
                    tf.scores = tf.scores,
                    solverPath = "~/Downloads/cplex",
                    top.tf = top.tf)
## [1] "Writing the objective function and constraints. This might take a bit of time.."
print(res$combined_solutions)
## $CellA
## source weight target reaction
## [1,] "A1" "58" "A4" "D2=D14"
## [2,] "A4" "58" "A8" "D14=D25; D13=D23"
## [3,] "A8" "58" "A11" "D23=D36"
## [4,] "A11" "58" "A14" "D34=D48"
## [5,] "A11" "58" "A17" "D34=D55"
## [6,] "L2" "58" "A1|A2" "PSEUDODOMAIN=D5; PSEUDODOMAIN=D3; PSEUDODOMAIN=D1"
## [7,] "A14" "58" "L1" "PSEUDODOMAIN=PSEUDODOMAIN"
## [8,] "A1|A2" "58" "A1" "D2=D2"
##
## $CellB
## source weight target reaction
## [1,] "B1" "58" "B3" "D2=D8; D3=D7"
## [2,] "B1" "58" "B4" "D2=D12"
## [3,] "B3" "58" "B6" "D7=D19; D8=D20"
## [4,] "B4" "58" "B8" "D13=D25; D12=D26"
## [5,] "B6" "58" "B10" "D19=D31; D20=D33"
## [6,] "B8" "58" "B12" "D25=D39"
## [7,] "B10" "58" "B13" "D31=D43; D34=D45"
## [8,] "B10" "58" "B14" "D31=D50; D34=D47"
## [9,] "B12" "58" "B16" "D39=D56"
## [10,] "L3" "58" "B1" "PSEUDODOMAIN=D1"
## [11,] "B13" "58" "L1" "PSEUDODOMAIN=PSEUDODOMAIN"
## [12,] "B16" "58" "L2" "PSEUDODOMAIN=PSEUDODOMAIN"
##
## $CellC
## source weight target reaction
## [1,] "C1" "58" "C3" "D1=D8"
## [2,] "C3" "58" "C7" "D8=D22"
## [3,] "C7" "58" "C10" "D22=D34"
## [4,] "C7" "58" "C11" "D22=D36"
## [5,] "C10" "58" "C13" "D34=D41"
## [6,] "C11" "58" "C15" "D36=D50"
## [7,] "L1" "58" "C1" "PSEUDODOMAIN=D3"
## [8,] "C13" "58" "L3" "PSEUDODOMAIN=PSEUDODOMAIN"
##
## $ligand_receptors
## source weight target reaction
## [1,] "L2" "58" "A1|A2" "PSEUDODOMAIN=D5; PSEUDODOMAIN=D3; PSEUDODOMAIN=D1"
## [2,] "L3" "58" "B1" "PSEUDODOMAIN=D1"
## [3,] "L1" "58" "C1" "PSEUDODOMAIN=D3"
Figure 3 depicts the network solution provided by LINDA+.
Figure 3: LINDA+ Toy Example - Solution 1
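For downstream use it can help to turn a printed solution into a tidy edge table with one row per domain-domain pair. The sketch below assumes that each element of `res$combined_solutions` is a character matrix with the columns ‘source’, ‘weight’, ‘target’ and ‘reaction’, as the printed output suggests; the three rows are copied from the CellA solution, and the names `cell.a.solution` and `domain.edges` are hypothetical.

```r
# Three rows copied from the CellA solution printed above.
cell.a.solution <- matrix(
  c("A1",  "58", "A4", "D2=D14",
    "A4",  "58", "A8", "D14=D25; D13=D23",
    "A14", "58", "L1", "PSEUDODOMAIN=PSEUDODOMAIN"),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("source", "weight", "target", "reaction"))
)

edges <- as.data.frame(cell.a.solution, stringsAsFactors = FALSE)
edges$weight <- as.numeric(edges$weight)

# A 'reaction' entry may hold several domain pairs separated by "; ".
pairs <- strsplit(edges$reaction, split = "; ", fixed = TRUE)

# Repeat each protein-level edge once per domain pair it contains.
domain.edges <- data.frame(
  source = rep(edges$source, lengths(pairs)),
  target = rep(edges$target, lengths(pairs)),
  stringsAsFactors = FALSE
)

# Split every "Dx=Dy" pair into its source and target domain.
dd <- do.call(rbind, strsplit(unlist(pairs), split = "=", fixed = TRUE))
domain.edges$domain_source <- dd[, 1]
domain.edges$domain_target <- dd[, 2]
print(domain.edges)
```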
Users can provide information about the abundance of ligands in the extra-cellular space, as measured by secretomics, through a data-frame object. More abundant ligands/extra-cellular molecules are more likely to initiate conformational changes in receptors. The data-frame should contain two columns: ‘ligand’ (the ligand ID’s) and ‘score’ (the score associated with each ligand, i.e. its abundance). The higher the score of a ligand, the more likely it is to appear in the solution. In this case, we penalize the inclusion of ligand L3 in the solution by giving it a lower score.
load(file = system.file("extdata", "toy.ligand.scores.RData", package = "LINDAPlus"))
print(ligand.scores)
## ligand score
## 1 L1 1
## 2 L2 1
## 3 L3 0
## 4 L4 1
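A data-frame of this shape could be derived from measured abundances. The sketch below is one possible approach and is not prescribed by LINDA+: the abundance values are made up, min-max scaling to the range 0 to 1 is an assumption, and the column names follow the toy object printed above.

```r
# Hypothetical secretomics abundances (arbitrary units), made up for illustration.
abundance <- c(L1 = 1200, L2 = 950, L3 = 15, L4 = 1100)

# Min-max scaling to [0, 1]; this choice of scaling is an assumption.
scaled <- (abundance - min(abundance)) / (max(abundance) - min(abundance))

ligand.scores.example <- data.frame(
  ligand = names(abundance),
  score = round(unname(scaled), 2),
  stringsAsFactors = FALSE
)
print(ligand.scores.example)
```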
res <- runLINDAPlus(background.networks.list = background.networks.list,
                    tf.scores = tf.scores,
                    solverPath = "~/Downloads/cplex",
                    top.tf = top.tf,
                    ligand.scores = ligand.scores,
                    lambda1 = 10,
                    lambda2 = 15)
## [1] "Writing the objective function and constraints. This might take a bit of time.."
print(res$combined_solutions)
## $CellA
## source weight target reaction
## [1,] "A3" "1" "A6" "D8=D20"
## [2,] "A6" "1" "A9" "D20=D26"
## [3,] "A6" "1" "A10" "D20=D32"
## [4,] "A9" "1" "A12" "D28=D40"
## [5,] "A10" "1" "A13" "D32=D43"
## [6,] "A12" "1" "A14" "D40=D50"
## [7,] "A13" "1" "A17" "D43=D56"
## [8,] "L4" "1" "A3" "PSEUDODOMAIN=D7"
## [9,] "A14" "9" "L1" "PSEUDODOMAIN=PSEUDODOMAIN"
## [10,] "A1" "8" "A4" "D2=D14"
## [11,] "A4" "8" "A8" "D14=D25; D13=D23"
## [12,] "A8" "8" "A11" "D23=D36"
## [13,] "A11" "8" "A14" "D34=D48"
## [14,] "A11" "8" "A17" "D34=D55"
## [15,] "L2" "8" "A1|A2" "PSEUDODOMAIN=D1"
## [16,] "A1|A2" "8" "A1" "D2=D2"
##
## $CellB
## source weight target reaction
## [1,] "B2" "9" "B4" "D4=D12; D6=D13"
## [2,] "B4" "9" "B8" "D13=D25; D12=D26"
## [3,] "B8" "9" "B11" "D25=D38; D26=D37"
## [4,] "B8" "9" "B12" "D25=D39"
## [5,] "B11" "9" "B14" "D38=D49; D37=D50"
## [6,] "B12" "9" "B16" "D39=D56"
## [7,] "L4" "9" "B2" "PSEUDODOMAIN=D4"
## [8,] "B16" "9" "L2" "PSEUDODOMAIN=PSEUDODOMAIN"
##
## $CellC
## source weight target reaction
## [1,] "C1" "9" "C3" "D1=D8"
## [2,] "C3" "9" "C7" "D8=D22"
## [3,] "C7" "9" "C10" "D22=D34"
## [4,] "C7" "9" "C11" "D22=D36"
## [5,] "C10" "9" "C13" "D34=D41"
## [6,] "C11" "9" "C15" "D36=D50"
## [7,] "C13" "9" "L4" "PSEUDODOMAIN=PSEUDODOMAIN"
## [8,] "L1" "9" "C1" "PSEUDODOMAIN=D3"
##
## $ligand_receptors
## source weight target reaction
## [1,] "L4" "1" "A3" "PSEUDODOMAIN=D7"
## [2,] "L4" "9" "B2" "PSEUDODOMAIN=D4"
## [3,] "L1" "9" "C1" "PSEUDODOMAIN=D3"
## [4,] "L2" "8" "A1|A2" "PSEUDODOMAIN=D1"
Figure 4 depicts how prioritizing ligand L4 over ligand L3 leads to a re-wiring of the interactions in our multi-cellular system, compared to the solution in Figure 3, where no such scores were considered.
Users can tune the parameters lambda1 and lambda2 to define how strongly they wish to prioritize the inclusion of TF’s over the inclusion of significantly expressed ligands in the extra-cellular space. Here we are using the default values of lambda1=10 and lambda2=15.
Figure 4: LINDA+ Toy Example - Solution 2
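The re-wiring between two solutions can also be summarised programmatically. The sketch below assumes that the `ligand_receptors` element of a solution is a character matrix with ‘source’ and ‘target’ columns, as the printed output suggests; the rows are copied from the two solutions above, and the helper `lr_keys` is hypothetical.

```r
# 'source' and 'target' columns copied from the two printed
# $ligand_receptors tables (solution without and with ligand scores).
lr.solution1 <- matrix(c("L2", "A1|A2",
                         "L3", "B1",
                         "L1", "C1"),
                       ncol = 2, byrow = TRUE,
                       dimnames = list(NULL, c("source", "target")))
lr.solution2 <- matrix(c("L4", "A3",
                         "L4", "B2",
                         "L1", "C1",
                         "L2", "A1|A2"),
                       ncol = 2, byrow = TRUE,
                       dimnames = list(NULL, c("source", "target")))

# Hypothetical helper: one "ligand=receptor" key per interaction.
lr_keys <- function(m) paste(m[, "source"], m[, "target"], sep = "=")

gained <- setdiff(lr_keys(lr.solution2), lr_keys(lr.solution1))
lost   <- setdiff(lr_keys(lr.solution1), lr_keys(lr.solution2))
kept   <- intersect(lr_keys(lr.solution1), lr_keys(lr.solution2))

print(list(gained = gained, lost = lost, kept = kept))
```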
When secretomics data is not available, users may still have evidence of functional ligand-receptor interactions, for example from previous ligand-enrichment analyses. In such cases, users can integrate the enrichment scores for ligand-receptor pairs into LINDA+, which helps guide the network inference towards the desired ligand-receptor interactions. To do so, users can provide a named list (one element per cell-type) of data-frames containing ligand-receptor enrichment scores normalized to values between -1 and 1.
load(file = system.file("extdata", "toy.lr.scores.RData", package = "LINDAPlus"))
print(lr.scores)
## $CellA
## lr.interaction score
## 1 L2=A1|A2 0.42
## 2 L4=A3 0.78
##
## $CellB
## lr.interaction score
## 1 L3=B1 0.03
## 2 L4=B2 0.97
##
## $CellC
## lr.interaction score
## 1 L1=C1 0.22
## 2 L1=C2 0.87
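The printed object suggests that each interaction is keyed as ‘ligand=receptor’. As a minimal sketch under that assumption, such a list can be built from a long table of scores; the table `lr.table` and the object `lr.scores.example` are hypothetical, with values taken from the printout above.

```r
# Long-format table of ligand-receptor scores (values from the printout above).
lr.table <- data.frame(
  cell_type = c("CellA", "CellA", "CellB"),
  ligand    = c("L2", "L4", "L4"),
  receptor  = c("A1|A2", "A3", "B2"),
  score     = c(0.42, 0.78, 0.97),
  stringsAsFactors = FALSE
)

# Build the "ligand=receptor" key used in the 'lr.interaction' column.
lr.table$lr.interaction <- paste(lr.table$ligand, lr.table$receptor, sep = "=")

# One data-frame per cell-type, with row names reset.
lr.scores.example <- lapply(
  split(lr.table[, c("lr.interaction", "score")], lr.table$cell_type),
  function(df) {
    rownames(df) <- NULL
    df
  }
)

# Scores are expected to lie between -1 and 1.
in.range <- all(vapply(lr.scores.example,
                       function(df) all(abs(df$score) <= 1),
                       logical(1)))
print(lr.scores.example)
```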
Here we can see that, for example, for CellA we prioritize the L4->A3 interaction over L2->A1|A2. This is reflected in the re-wiring of the interactions compared to the solution shown in Figure 3, as follows:
res <- runLINDAPlus(background.networks.list = background.networks.list,
                    tf.scores = tf.scores,
                    solverPath = "~/Downloads/cplex",
                    top.tf = top.tf,
                    lr.scores = lr.scores,
                    lambda1 = 10,
                    lambda3 = 1)
## [1] "Writing the objective function and constraints. This might take a bit of time.."
print(res$combined_solutions)
## $CellA
## source weight target reaction
## [1,] "A3" "10" "A6" "D8=D20"
## [2,] "A6" "10" "A9" "D20=D26"
## [3,] "A6" "10" "A10" "D20=D32"
## [4,] "A9" "10" "A12" "D28=D40"
## [5,] "A10" "10" "A13" "D32=D43"
## [6,] "A12" "10" "A14" "D40=D50"
## [7,] "A13" "10" "A17" "D43=D56"
## [8,] "L4" "10" "A3" "PSEUDODOMAIN=D7"
## [9,] "A14" "10" "L1" "PSEUDODOMAIN=PSEUDODOMAIN"
##
## $CellB
## source weight target reaction
## [1,] "B1" "10" "B3" "D2=D8; D3=D7"
## [2,] "B1" "10" "B4" "D2=D12"
## [3,] "B3" "10" "B6" "D7=D19; D8=D20"
## [4,] "B4" "10" "B8" "D13=D25; D12=D26"
## [5,] "B6" "10" "B10" "D19=D31; D20=D33"
## [6,] "B8" "10" "B12" "D25=D39"
## [7,] "B10" "10" "B13" "D31=D43; D34=D45"
## [8,] "B10" "10" "B14" "D31=D50; D34=D47"
## [9,] "B12" "10" "B16" "D39=D56"
## [10,] "L3" "10" "B1" "PSEUDODOMAIN=D1"
## [11,] "B13" "10" "L1" "PSEUDODOMAIN=PSEUDODOMAIN"
## [12,] "B16" "10" "L2" "PSEUDODOMAIN=PSEUDODOMAIN"
##
## $CellC
## source weight target reaction
## [1,] "C1" "10" "C3" "D1=D8"
## [2,] "C3" "10" "C7" "D8=D22"
## [3,] "C7" "10" "C10" "D22=D34"
## [4,] "C7" "10" "C11" "D22=D36"
## [5,] "C10" "10" "C13" "D34=D41"
## [6,] "C11" "10" "C15" "D36=D50"
## [7,] "C13" "10" "L4" "PSEUDODOMAIN=PSEUDODOMAIN"
## [8,] "L1" "10" "C1" "PSEUDODOMAIN=D3"
## [9,] "C13" "10" "L3" "PSEUDODOMAIN=PSEUDODOMAIN"
##
## $ligand_receptors
## source weight target reaction
## [1,] "L4" "10" "A3" "PSEUDODOMAIN=D7"
## [2,] "L3" "10" "B1" "PSEUDODOMAIN=D1"
## [3,] "L1" "10" "C1" "PSEUDODOMAIN=D3"
We also observe that for CellB, although we have given a higher enrichment score to the L4->B2 interaction than to L3->B1, L4->B2 is still not present in the final solution. This is because we assign a relatively low weight (lambda3=1) to the inclusion of ligand-receptor interactions based on enrichment analyses compared to the inclusion of TF’s (lambda1=10). Giving a larger value to lambda3 can make it possible to retrieve the L4->B2 interaction.
Users can additionally provide scores (between 0 and 1) representing the probability that two cell-types directly communicate with each other (e.g. as made evident through spatial transcriptomics). The higher the score, the more likely it is that a cell-type pair communicates directly; when the score is set to 0, special constraints prevent the two cell-types from communicating directly with each other. Let’s see how such scores can be provided:
load(file = system.file("extdata", "toy.ccc.scores.RData", package = "LINDAPlus"))
print(ccc.scores)
## ccc score
## 1 CellA=CellB 0.5
## 2 CellA=CellC 0.0
## 3 CellB=CellC 0.5
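A table of this shape can be generated for any set of cell-types. The sketch below is an illustration in base R: the object names are hypothetical, the scores repeat the toy values, and whether LINDA+ expects each pair in a particular order (e.g. ‘CellA=CellC’ rather than ‘CellC=CellA’) is not stated here, so the alphabetical order used below is an assumption.

```r
cell.types <- c("CellA", "CellB", "CellC")

# All unordered cell-type pairs, written as "X=Y" as in the toy object.
pairs <- combn(cell.types, 2, FUN = paste, collapse = "=")

ccc.scores.example <- data.frame(
  ccc = pairs,
  score = c(0.5, 0, 0.5),
  stringsAsFactors = FALSE
)

# Pairs with a score of 0 are the ones barred from direct communication.
blocked <- ccc.scores.example$ccc[ccc.scores.example$score == 0]
print(blocked)
```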
In the example given, the probability of communication between CellA and CellC is set to 0. Figure 6 depicts how providing such scores affects the network solution:
res <- runLINDAPlus(background.networks.list = background.networks.list,
                    tf.scores = tf.scores,
                    solverPath = "~/Downloads/cplex",
                    top.tf = top.tf,
                    ccc.scores = ccc.scores,
                    lambda1 = 10,
                    lambda3 = 1)
## [1] "Writing the objective function and constraints. This might take a bit of time.."
print(res$combined_solutions)
## $CellA
## source weight target reaction
## [1,] "A1" "8" "A4" "D2=D14"
## [2,] "A4" "8" "A8" "D13=D23; D14=D25"
## [3,] "A8" "8" "A11" "D23=D36"
## [4,] "A11" "8" "A17" "D34=D55"
## [5,] "L2" "8" "A1|A2" "PSEUDODOMAIN=D3; PSEUDODOMAIN=D5"
## [6,] "A1|A2" "8" "A1" "D2=D2"
##
## $CellB
## source weight target reaction
## [1,] "B1" "8" "B3" "D2=D8; D3=D7"
## [2,] "B1" "8" "B4" "D2=D12"
## [3,] "B3" "8" "B6" "D8=D20; D7=D19"
## [4,] "B4" "8" "B8" "D12=D26"
## [5,] "B6" "8" "B10" "D19=D31; D20=D33"
## [6,] "B8" "8" "B12" "D25=D39"
## [7,] "B10" "8" "B13" "D31=D43"
## [8,] "B10" "8" "B14" "D31=D50"
## [9,] "B12" "8" "B16" "D39=D56"
## [10,] "L3" "8" "B1" "PSEUDODOMAIN=D1"
## [11,] "B13" "8" "L1" "PSEUDODOMAIN=PSEUDODOMAIN"
## [12,] "B16" "8" "L2" "PSEUDODOMAIN=PSEUDODOMAIN"
##
## $CellC
## source weight target reaction
## [1,] "C1" "8" "C3" "D1=D8"
## [2,] "C3" "8" "C7" "D8=D22"
## [3,] "C7" "8" "C10" "D22=D34"
## [4,] "C7" "8" "C11" "D22=D36"
## [5,] "C10" "8" "C13" "D34=D41"
## [6,] "C11" "8" "C15" "D36=D50"
## [7,] "L1" "8" "C1" "PSEUDODOMAIN=D3"
## [8,] "C13" "8" "L3" "PSEUDODOMAIN=PSEUDODOMAIN"
##
## $ligand_receptors
## source weight target reaction
## [1,] "L2" "8" "A1|A2" "PSEUDODOMAIN=D3; PSEUDODOMAIN=D5"
## [2,] "L3" "8" "B1" "PSEUDODOMAIN=D1"
## [3,] "L1" "8" "C1" "PSEUDODOMAIN=D3"
As we can see, none of the extra-cellular interactions that would enable communication between CellA and CellC are present in the solution.
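This can be verified from the solution itself by listing, for each cell-type, the ligands it secretes (edges ending in a ligand) and the ligands it receives (edges starting from a ligand). The sketch below uses only the ligand-related rows copied from the solution printed above; the object names are hypothetical, and treating a shared ligand as a sender-to-receiver link is a simplifying assumption.

```r
ligands <- c("L1", "L2", "L3", "L4")

# Ligand-related (source, target) rows copied from the printed solution.
solution <- list(
  CellA = rbind(c("L2", "A1|A2")),
  CellB = rbind(c("L3", "B1"), c("B13", "L1"), c("B16", "L2")),
  CellC = rbind(c("L1", "C1"), c("C13", "L3"))
)

# Ligands secreted by a cell are edge targets; ligands received are edge sources.
secreted <- lapply(solution, function(m) m[m[, 2] %in% ligands, 2])
received <- lapply(solution, function(m) m[m[, 1] %in% ligands, 1])

# A sender communicates with a receiver when they share a ligand.
links <- character(0)
for (sender in names(solution)) {
  for (receiver in setdiff(names(solution), sender)) {
    shared <- intersect(secreted[[sender]], received[[receiver]])
    if (length(shared) > 0) {
      links <- c(links, paste0(sender, "->", receiver, " via ", shared))
    }
  }
}
print(links)
```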
Given that LINDA+ simultaneously infers not only protein-protein interactions but also domain interactions, it enables us to examine how RNA modification mechanisms, like alternative splicing, might influence the presence or absence of domains within the structure of interacting proteins. This, in turn, allows us to assess the effects of such modifications on the interactions between proteins.
This is achieved by passing to the network inference function an as.input data-frame object, which lists the domain ID’s of certain proteins for any cell-type together with how they have been affected according to, for example, evidence from differential splicing analyses. These effects can be exclusion (when we know that a domain of a protein has been skipped) or inclusion (when we want to understand how the inclusion of a domain in the network solution might affect the protein interactions).
The toy example below demonstrates how the as.input object should be defined.
load(file = system.file("extdata", "toy.as.input.RData", package = "LINDAPlus"))
print(as.input)
## cell_type proteinID domainID effect
## 1 CellA A8 D23 exclusion
## 2 CellA A9 D26 inclusion
## 3 CellA A9 D28 inclusion
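Before running the inference it can be useful to check that the listed domains actually occur in the background network of the given cell-type. The sketch below uses four CellA rows copied from the toy background network; the matching rule (a protein/domain pair counts as found if it appears on either side of any interaction) is an assumption and may not be the exact rule LINDA+ applies, and the object names are hypothetical.

```r
# A few CellA rows copied from the toy background network.
cell.a <- data.frame(
  pfam_source = c("D13", "D23", "D20", "D28"),
  pfam_target = c("D23", "D36", "D26", "D40"),
  gene_source = c("A4", "A8", "A6", "A9"),
  gene_target = c("A8", "A11", "A9", "A12"),
  stringsAsFactors = FALSE
)

as.input.example <- data.frame(
  cell_type = "CellA",
  proteinID = c("A8", "A9", "A9"),
  domainID  = c("D23", "D26", "D28"),
  effect    = c("exclusion", "inclusion", "inclusion"),
  stringsAsFactors = FALSE
)

# Assumed matching rule: a domain is "found" if the protein/domain pair
# appears on the source or the target side of any interaction.
known.pairs <- unique(c(paste(cell.a$gene_source, cell.a$pfam_source),
                        paste(cell.a$gene_target, cell.a$pfam_target)))
found <- paste(as.input.example$proteinID,
               as.input.example$domainID) %in% known.pairs

cat(sum(found), "domains out of", length(found), "found in the network\n")
```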
Such an object can then be given as an input to the main runLINDAPlus() function in order to infer splice-dependent mechanisms of protein interactions.
res <- runLINDAPlus(background.networks.list = background.networks.list,
                    tf.scores = tf.scores,
                    as.input = as.input,
                    solverPath = "~/Downloads/cplex",
                    top.tf = top.tf)
## [1] "3 domains out of 3 total given in the 'as.input' have been found in the background network."
## [1] "Writing the objective function and constraints. This might take a bit of time.."
print(res$combined_solutions)
## $CellA
## source weight target reaction
## [1,] "A3" "7" "A6" "D8=D20"
## [2,] "A6" "7" "A9" "D20=D26"
## [3,] "A6" "7" "A10" "D20=D32"
## [4,] "A9" "7" "A12" "D28=D40"
## [5,] "A10" "7" "A13" "D32=D43"
## [6,] "A12" "7" "A14" "D40=D50"
## [7,] "A13" "7" "A17" "D43=D56"
## [8,] "L4" "7" "A3" "PSEUDODOMAIN=D7"
## [9,] "A14" "7" "L1" "PSEUDODOMAIN=PSEUDODOMAIN"
##
## $CellB
## source weight target reaction
## [1,] "B1" "7" "B3" "D2=D8; D3=D7"
## [2,] "B1" "7" "B4" "D2=D12"
## [3,] "B3" "7" "B6" "D7=D19; D8=D20"
## [4,] "B4" "7" "B8" "D13=D25; D12=D26"
## [5,] "B6" "7" "B10" "D19=D31; D20=D33"
## [6,] "B8" "7" "B12" "D25=D39"
## [7,] "B10" "7" "B13" "D31=D43; D34=D45"
## [8,] "B10" "7" "B14" "D31=D50; D34=D47"
## [9,] "B12" "7" "B16" "D39=D56"
## [10,] "L3" "7" "B1" "PSEUDODOMAIN=D1"
## [11,] "B13" "7" "L1" "PSEUDODOMAIN=PSEUDODOMAIN"
## [12,] "B16" "7" "L2" "PSEUDODOMAIN=PSEUDODOMAIN"
##
## $CellC
## source weight target reaction
## [1,] "C1" "7" "C3" "D1=D8"
## [2,] "C3" "7" "C7" "D8=D22"
## [3,] "C7" "7" "C10" "D22=D34"
## [4,] "C7" "7" "C11" "D22=D36"
## [5,] "C10" "7" "C13" "D34=D41"
## [6,] "C11" "7" "C15" "D36=D50"
## [7,] "C13" "7" "L4" "PSEUDODOMAIN=PSEUDODOMAIN"
## [8,] "L1" "7" "C1" "PSEUDODOMAIN=D3"
## [9,] "C13" "7" "L3" "PSEUDODOMAIN=PSEUDODOMAIN"
##
## $ligand_receptors
## source weight target reaction
## [1,] "L4" "7" "A3" "PSEUDODOMAIN=D7"
## [2,] "L3" "7" "B1" "PSEUDODOMAIN=D1"
## [3,] "L1" "7" "C1" "PSEUDODOMAIN=D3"
In the figure below we can see how the addition of information about included or excluded protein domains affects the re-wiring of the protein interactions.
Figure 7: LINDA+ Toy Example - Solution 5
Users can additionally inspect the type/attribute of each node in the generated networks. This helps with further visualization of the networks in Cytoscape or in the LINDAvis R-shiny App (under development).
print(res$node_attributes)
## $CellA
## node attribute
## [1,] "A3" "receptor"
## [2,] "A1|A2" "receptor"
## [3,] "A1" "protein"
## [4,] "A2" "protein"
## [5,] "A4" "protein"
## [6,] "A5" "protein"
## [7,] "A6" "protein"
## [8,] "A8" "protein"
## [9,] "A9" "protein"
## [10,] "A10" "protein"
## [11,] "A11" "protein"
## [12,] "A12" "protein"
## [13,] "A13" "protein"
## [14,] "A7" "protein"
## [15,] "A14" "tf"
## [16,] "A15" "tf"
## [17,] "A17" "tf"
## [18,] "A16" "tf"
## [19,] "L2" "ligand"
## [20,] "L4" "ligand"
## [21,] "L1" "ligand"
## [22,] "D2" "domain"
## [23,] "D4" "domain"
## [24,] "D6" "domain"
## [25,] "D10" "domain"
## [26,] "D9" "domain"
## [27,] "D8" "domain"
## [28,] "D12" "domain"
## [29,] "D13" "domain"
## [30,] "D14" "domain"
## [31,] "D16" "domain"
## [32,] "D20" "domain"
## [33,] "D19" "domain"
## [34,] "D22" "domain"
## [35,] "D23" "domain"
## [36,] "D24" "domain"
## [37,] "D28" "domain"
## [38,] "D32" "domain"
## [39,] "D31" "domain"
## [40,] "D30" "domain"
## [41,] "D34" "domain"
## [42,] "D35" "domain"
## [43,] "D36" "domain"
## [44,] "D38" "domain"
## [45,] "D39" "domain"
## [46,] "D40" "domain"
## [47,] "D37" "domain"
## [48,] "D44" "domain"
## [49,] "D41" "domain"
## [50,] "D42" "domain"
## [51,] "D43" "domain"
## [52,] "PSEUDODOMAIN" "domain"
## [53,] "D18" "domain"
## [54,] "D15" "domain"
## [55,] "D21" "domain"
## [56,] "D47" "domain"
## [57,] "D46" "domain"
## [58,] "D25" "domain"
## [59,] "D26" "domain"
## [60,] "D48" "domain"
## [61,] "D52" "domain"
## [62,] "D54" "domain"
## [63,] "D53" "domain"
## [64,] "D55" "domain"
## [65,] "D51" "domain"
## [66,] "D50" "domain"
## [67,] "D59" "domain"
## [68,] "D61" "domain"
## [69,] "D60" "domain"
## [70,] "D57" "domain"
## [71,] "D56" "domain"
## [72,] "D1" "domain"
## [73,] "D3" "domain"
## [74,] "D5" "domain"
## [75,] "D7" "domain"
##
## $CellB
## node attribute
## [1,] "B1" "receptor"
## [2,] "B2" "receptor"
## [3,] "B3" "protein"
## [4,] "B4" "protein"
## [5,] "B5" "protein"
## [6,] "B6" "protein"
## [7,] "B7" "protein"
## [8,] "B8" "protein"
## [9,] "B9" "protein"
## [10,] "B10" "protein"
## [11,] "B11" "protein"
## [12,] "B12" "protein"
## [13,] "B13" "tf"
## [14,] "B16" "tf"
## [15,] "B14" "tf"
## [16,] "B15" "tf"
## [17,] "L3" "ligand"
## [18,] "L4" "ligand"
## [19,] "L1" "ligand"
## [20,] "L2" "ligand"
## [21,] "D3" "domain"
## [22,] "D2" "domain"
## [23,] "D5" "domain"
## [24,] "D6" "domain"
## [25,] "D4" "domain"
## [26,] "D8" "domain"
## [27,] "D9" "domain"
## [28,] "D7" "domain"
## [29,] "D12" "domain"
## [30,] "D13" "domain"
## [31,] "D11" "domain"
## [32,] "D16" "domain"
## [33,] "D15" "domain"
## [34,] "D14" "domain"
## [35,] "D20" "domain"
## [36,] "D19" "domain"
## [37,] "D23" "domain"
## [38,] "D22" "domain"
## [39,] "D26" "domain"
## [40,] "D25" "domain"
## [41,] "D24" "domain"
## [42,] "D30" "domain"
## [43,] "D31" "domain"
## [44,] "D34" "domain"
## [45,] "D32" "domain"
## [46,] "D37" "domain"
## [47,] "D38" "domain"
## [48,] "D35" "domain"
## [49,] "D40" "domain"
## [50,] "D41" "domain"
## [51,] "D39" "domain"
## [52,] "PSEUDODOMAIN" "domain"
## [53,] "D10" "domain"
## [54,] "D17" "domain"
## [55,] "D18" "domain"
## [56,] "D21" "domain"
## [57,] "D27" "domain"
## [58,] "D28" "domain"
## [59,] "D29" "domain"
## [60,] "D33" "domain"
## [61,] "D36" "domain"
## [62,] "D43" "domain"
## [63,] "D45" "domain"
## [64,] "D48" "domain"
## [65,] "D50" "domain"
## [66,] "D47" "domain"
## [67,] "D49" "domain"
## [68,] "D53" "domain"
## [69,] "D51" "domain"
## [70,] "D54" "domain"
## [71,] "D52" "domain"
## [72,] "D56" "domain"
## [73,] "D55" "domain"
## [74,] "D1" "domain"
##
## $CellC
## node attribute
## [1,] "C1" "receptor"
## [2,] "C2" "receptor"
## [3,] "C3" "protein"
## [4,] "C4" "protein"
## [5,] "C5" "protein"
## [6,] "C6" "protein"
## [7,] "C7" "protein"
## [8,] "C8" "protein"
## [9,] "C9" "protein"
## [10,] "C10" "protein"
## [11,] "C11" "protein"
## [12,] "C12" "protein"
## [13,] "C13" "tf"
## [14,] "C14" "tf"
## [15,] "C15" "tf"
## [16,] "C16" "tf"
## [17,] "L1" "ligand"
## [18,] "L4" "ligand"
## [19,] "L3" "ligand"
## [20,] "D1" "domain"
## [21,] "D2" "domain"
## [22,] "D3" "domain"
## [23,] "D6" "domain"
## [24,] "D4" "domain"
## [25,] "D7" "domain"
## [26,] "D8" "domain"
## [27,] "D9" "domain"
## [28,] "D11" "domain"
## [29,] "D13" "domain"
## [30,] "D15" "domain"
## [31,] "D16" "domain"
## [32,] "D14" "domain"
## [33,] "D17" "domain"
## [34,] "D18" "domain"
## [35,] "D22" "domain"
## [36,] "D23" "domain"
## [37,] "D24" "domain"
## [38,] "D27" "domain"
## [39,] "D29" "domain"
## [40,] "D31" "domain"
## [41,] "D34" "domain"
## [42,] "D33" "domain"
## [43,] "D32" "domain"
## [44,] "D36" "domain"
## [45,] "D35" "domain"
## [46,] "D37" "domain"
## [47,] "D39" "domain"
## [48,] "D40" "domain"
## [49,] "D38" "domain"
## [50,] "PSEUDODOMAIN" "domain"
## [51,] "D10" "domain"
## [52,] "D12" "domain"
## [53,] "D21" "domain"
## [54,] "D28" "domain"
## [55,] "D25" "domain"
## [56,] "D26" "domain"
## [57,] "D30" "domain"
## [58,] "D41" "domain"
## [59,] "D42" "domain"
## [60,] "D44" "domain"
## [61,] "D47" "domain"
## [62,] "D48" "domain"
## [63,] "D49" "domain"
## [64,] "D50" "domain"
## [65,] "D55" "domain"
## [66,] "D52" "domain"
## [67,] "D53" "domain"
## [68,] "D54" "domain"
##
## $`ligand-receptors`
## node attribute
## [1,] "L4" "ligand"
## [2,] "L3" "ligand"
## [3,] "L1" "ligand"
## [4,] "A3" "receptor"
## [5,] "B1" "receptor"
## [6,] "C1" "receptor"
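For visualization in an external tool, a node-attribute table can be written to a delimited file. The sketch below assumes that each element of `res$node_attributes` is a character matrix with ‘node’ and ‘attribute’ columns, as the printed output suggests; the rows are copied from the ligand-receptors table above, and the file name is hypothetical.

```r
# Rows copied from the printed `ligand-receptors` node attributes.
node.attributes <- matrix(
  c("L4", "ligand",
    "L3", "ligand",
    "L1", "ligand",
    "A3", "receptor",
    "B1", "receptor",
    "C1", "receptor"),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("node", "attribute"))
)

node.table <- as.data.frame(node.attributes, stringsAsFactors = FALSE)

# Write to a temporary file; replace with a path of your choice.
out.file <- file.path(tempdir(), "ligand_receptors_nodes.csv")
write.csv(node.table, out.file, row.names = FALSE, quote = FALSE)

# Quick summary of how many nodes of each type were exported.
node.counts <- table(node.table$attribute)
print(node.counts)
```

The resulting CSV has one row per node and can be imported as a node table alongside an edge table built from the corresponding solution.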